Sentiment classification (SC) is an important research field in natural
language processing (NLP) that classifies, extracts, and recognizes
subjective information from unstructured text, including opinions,
evaluations, emotions, and attitudes. Human-robot interaction (HRI) also
involves natural language processing, knowledge representation, and
reasoning, drawing on deep learning, cognitive science, and robotics.
However, sentiment classification is rarely implemented for HRI, especially
for navigating a robot using the Indonesian language, which is semantically
dynamic when written as text. This paper proposes a sentiment
classification of Bahasa Indonesia that directs a delta robot to move along
particular trajectory directions. Navigation commands for the delta robot
were vectorized using a word embedding method into two-dimensional
matrices that serve as input patterns for a classifier such as a
convolutional neural network (CNN). The results compare particular
architectures of CNN, GloVe-CNN, and Word2Vec-CNN. As classifiers, the CNN
models were trained, validated, and tested, reaching an accuracy of up to
98.97% and executing in less than a minute. The classifier produces four
navigation labels, 'kanan' (right), 'kiri' (left), 'atas' (top), and 'bawah'
(bottom), along with a multiplier factor. The classifier result is used to
transform any navigation command into a direction and end-effector
coordinates.
1. INTRODUCTION

Multi-robot communication can carry out more tasks effectively than one-to-one robot
communication. A multi-robot coordination system is feasible for controlling motion-coordination tasks
of large-scale swarm robots [1]. Multi-robot systems may be used in various circumstances, including drone
delivery, agriculture, search-and-rescue, disaster relief, and defense [2]. Nowadays, robots are able to
communicate not only with other robots but also with humans. Socially interactive robots must be capable of
participating in human-like discussions. In this context, human-robot interaction (HRI) also involves natural
language processing, knowledge representation, and reasoning [3].

Sentiment classification is an intriguing and essential research issue in natural language processing
(NLP) [4]. The sentiment classification task obtains the sentiment polarity contained in a sentence, even
though the sentence information carried by any single word is incomplete [5]. A sentence encoder is
therefore required to extract sentence features and generate a vector representation of the sentence.
Sentiment classification attempts to classify opinionated material automatically, which consists of analyzing
opinions [6], evaluations, emotions, attitudes, and other sentiment entities. Although the literature on
sentiment classification is fairly rich, sentiment classification for HRI is seldom applied, particularly when
navigating a robot using the Indonesian language, which is semantically dynamic when expressed in text.

Indonesian language sentiment classification has been implemented in many applications, such
as movie reviews, presidential elections [7], and hotel reviews. Several approaches have been used as
sentiment models, such as Naive Bayes (NB), support vector machine (SVM) [8], and deep learning models
such as deep neural network (DNN) [9] and convolutional neural network (CNN) [10]. Word embedding is a
technique that learns a continuous low-dimensional vector space of word representations by using contextual
information from a substantial corpus [11]. The vector representation of words learned with unsupervised
techniques [12] has proven to be very effective in capturing the meaning of the sentiment. Word embedding
has several types, such as Word2Vec, GloVe [13], Re(Word2Vec), and Re(GloVe) [14]. To better distinguish
words, word embedding methods have been combined with deep learning models such as CNN and
bidirectional long short-term memory (Bi-LSTM).

Delta robot trajectory planning has been studied using particular methods, such as numerical
algorithms, geometrical methods, genetic algorithm (GA), fuzzy algorithms, particle swarm optimization (PSO),
and artificial neural network (ANN) [15]. However, none of these methods is integrated with sentiment
classification. Sentiment classification for HRI is rarely implemented, especially for navigating delta robot
trajectory planning. The delta robot has been selected as the research object, which manages the trajectory
point in the coordinate space of the end-effector (x, y).


Therefore, this paper proposes a sentiment classification of Bahasa Indonesia to produce the delta
robot trajectory planning using word embedding and CNN. The word embedding (Word2Vec and GloVe)
produces text patterns as the initial features of the CNN inputs with an n-dimensional text representation.
The CNN classifier produces four navigation labels, 'kanan' (right), 'kiri' (left), 'atas' (top), and 'bawah'
(bottom), along with a multiplier factor. The output of the CNN classifier supports the direction of the delta
robot trajectory planning. Given the trajectory set point, the delta robot is able to navigate each
successive trajectory described in the text command.


2. RESEARCH METHOD 

The self-learning of the delta robot has been developed using inverse kinematics (IK) and artificial
neural networks (ANN) [15]. This research improves the input by providing a sentiment classification of
given sentences in Bahasa Indonesia, which involves three stages: dataset collection, dataset
pre-processing, and the classification process. The output provides the navigation label to the ANN
layers. Figure 1 shows the Bahasa Indonesia sentiment classification research method used to produce the
delta robot trajectory planning using word embedding and CNN.
Figure 1. Research method
Our dataset was collected by distributing a questionnaire to 46 people. The navigation commands were
manually labeled into four labels. The pre-processing stage removes noise from the classifier inputs so that
the classifier achieves high accuracy. Furthermore, we embed the clean navigation commands into word
embedding vectors such as GloVe and Word2Vec. This stage produces 50-dimensional embedding vectors.
This paper tunes the CNN hyperparameters to analyze the preferred CNN model for the given dataset
distribution across the training, validation, and testing processes. Additionally, this paper applies the
well-tuned CNN models to the delta robot trajectory control, which uses the IK-ANN to navigate the
end-effector coordinates given by the sentiment classifier.


2.1. Dataset collection method 

A questionnaire was distributed to collect the dataset of delta robot navigation commands. The
collected navigation commands consist of four navigation labels: left 'kiri', right 'kanan', up 'atas', and
down 'bawah'. Figure 2(a) represents the questionnaire image form for model 1, Figure 2(b) represents the
questionnaire image form for model 2, and Figure 2(c) represents the questionnaire image form for model 3.
The questionnaire images helped the respondents fill in the navigation command form.

Table 1 shows the navigation commands corresponding to Figure 2. These collected commands are used to
navigate the delta robot to pick up the yellow box and move it to the coordinate of the blue box. The
collected navigation commands comprise 2024 rows across the command labels: 516 rows of left 'kiri'
commands, 496 rows of right 'kanan' commands, 598 rows of top 'atas' commands, and 414 rows of bottom
'bawah' commands. The proposed method is evaluated using this delta robot navigation dataset.


Figure 2. Questionnaire image of Delta robot navigation commands (a) model 1, (b) model 2, and (c) model 3 


2.2. Dataset preprocessing 

Table 1 shows the dataset of delta robot navigation commands collected by distributing the
questionnaire with the given images. The collected navigation commands were stored as unstructured
sentences that cannot be classified directly. Unstructured data is hard to classify and yields poor accuracy.
Raw sentences need to be preprocessed to produce clean sentences as the classifier input [16]. In
Indonesian, the same denotative meaning can be written in several ways. For instance, the navigation
commands 'satu kotak kekanan', 'kekanan satu langkah', and 'pindah kekanan' all express a move to the right
('one step to the right'). The desired form is 'ke kanan', as verified against the Indonesian dictionary,
Kamus Besar Bahasa Indonesia (KBBI).

Preprocessing contains sub-processes such as manual labeling, case folding, punctuation removal,
single-character removal, white space removal, slang word replacement, tokenization, stop-word removal,
stemming, multiplier extraction, and word vectorization [17]. The aim of text preprocessing is to remove
noise from the classifier input so that the classifier achieves high accuracy.

Case folding converts input sentences into lowercase without exception. Slang word replacement
corrects non-standard spellings: when the audience writes 'kekanan', 'kknan', 'knn', or 'kanaaaaaannn', the
word should be written as 'ke kanan'. Similarly, most of the audience wrote 'sangat', 'banget', 'bgd',
'bangeeedddd', '...sekali', or 'bangedh', which mean 'really'. This process replaces slang words with words
found in the Indonesian dictionary (KBBI). Another process removes punctuation and white space, such as
~!@#$%^&*()_+{}:"?><;'/.,[]\=-. Tokenization segregates a sentence into a vector of words separated by
spaces, e.g., 'satu langkah kekanan' ('one step to the right') into 'satu', 'langkah', and 'kekanan'.
Stop-word removal searches for and removes words that do not affect the desired classification, such as
conjunctions [18] (e.g., 'dan' means 'and', 'lalu' means 'then', and 'selanjutnya' means 'furthermore').
Stemming returns a word that contains prefixes and suffixes to its original form [19]. For instance,
'...ke kanan' becomes '...kanan' ('...to the right'), '...ke kiri' becomes '...kiri' ('...to the left'), and
'...ke atas' becomes '...atas' ('...upward'). Multiplier extraction extracts the multiplier from the
stemming output, such as 'satu' ('one') into '1' and 'dua' ('two') into '2'. Word vectorization maps each
word of a command to an integer index, for instance, 'kanan dua langkah' into (2, 3, 7) and 'empat kanan'
into (6, 2). The word embedding method then turns these index sequences into word representations of a
fixed dimension.
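
The preprocessing steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the `SLANG`, `STOP_WORDS`, and `NUMBERS` tables are small hypothetical samples built from the examples in this section, and dropping the preposition 'ke' stands in for a full Indonesian stemmer.

```python
import re

# Hypothetical sample tables; the full slang and stop-word lists are not given in the text.
SLANG = {"kekanan": "ke kanan", "kknan": "ke kanan", "knan": "ke kanan",
         "kekiri": "ke kiri", "keatas": "ke atas", "kebawah": "ke bawah"}
STOP_WORDS = {"dan", "lalu", "selanjutnya", "kemudian"}
NUMBERS = {"satu": 1, "dua": 2, "tiga": 3, "empat": 4, "lima": 5,
           "enam": 6, "tujuh": 7, "delapan": 8, "sembilan": 9}


def preprocess(command):
    """Return (clean tokens, multiplier) for one raw navigation command."""
    text = command.lower()                                # case folding
    text = re.sub(r"[^\w\s]", " ", text)                  # punctuation removal
    tokens = text.split()                                 # tokenization on white space
    # Slang replacement; a replacement may expand into two words, so re-split.
    tokens = " ".join(SLANG.get(t, t) for t in tokens).split()
    tokens = [t for t in tokens if t not in STOP_WORDS]   # stop-word removal
    tokens = [t for t in tokens if t != "ke"]             # crude stand-in for stemming
    multiplier = 1                                        # assumed default when no number is given
    clean = []
    for t in tokens:                                      # multiplier extraction
        if t in NUMBERS:
            multiplier = NUMBERS[t]
        elif t.isdigit():
            multiplier = int(t)
        else:
            clean.append(t)
    return clean, multiplier
```

Under these assumptions, `preprocess("lalu, 2 kotak keatas")` yields the tokens `['kotak', 'atas']` with a multiplier of 2.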


Table 1. Collected navigation commands 


Commands                                                        Labels
                                                                left     right    top      bottom
                                                                'kiri'   'kanan'  'atas'   'bawah'
Delta Robot, please move the yellow box one step to the right   0        1        0        0
  • 'Delta robot, pindahkan kotak kuning 1 langkah kekanan'
  • 'Delta robot, pindahkan kotak kuning satu langkah ke kanan'
  • 'pindahkan ke kanan satu kotak'
  • 'diawali dgn memindahkan 1 kotak ke kanan'
  • ...
Then, two boxes up                                              0        0        1        0
  • 'kemudian, 2 kotak ke atas'
  • 'kemudian, 2 kotak keatas'
  • 'lalu, 2 kotak ke atas'
  • '2 kotak ke atas'
  • ...
Furthermore, one step to the right                              0        1        0        0
  • 'lalu, satu box ke kanan'
  • 'dan, satu langkah kekanan'
  • 'satu kotak ke kanan'
  • 'knan 1'
  • ...
Finally, drop the yellow box by moving one box up               0        0        1        0
  • 'terakhir, satu box ke atas'
  • 'terakhir, pindahkkan satu kotak ke atas'
  • 'kemudian, satu langkah ke atas sampai di box biru'
  • 'pindahkan kotak kuning satu langkah ke atas, sampai di kotak biru'
  • ...


2.3. Word embedding and classifier 

A convolutional neural network (CNN) consists of convolution layers, pooling layers, and fully
connected layers, where convolution is a sliding matrix multiplication between a filter and the input. CNNs
inherently handle variable-size sentences through pooling operations, and they also consider the order of
words and the context in which each word appears. This contrasts with CNNs for image classification, where
networks with a high number of convolutional layers, such as VGG, AlexNet, MobileNet, and ResNet, are used.
These networks stack many layers to extract abstract features from images, and they require large amounts of
memory and computation during training to produce higher accuracy. In the context of text classification,
as few as one convolution layer can be sufficient. Figure 3 illustrates the architecture of a convolutional
neural network (CNN).


[Figure 3 diagram: features input (x × n) and word embedding (Word2Vec/GloVe, n dimensions); convolutional
layers 1 to n (filter size = fs, kernel size = ks, strides = s, padding = p); global max pooling;
fully-connected layers 1 to n]

Figure 3. The architecture of convolutional neural network with word embedding 
Let $x_i \in \mathbb{R}^{d}$ denote the $d$-dimensional Word2Vec or GloVe representation of the $i$-th word.
A convolutional filter is denoted as $w \in \mathbb{R}^{h \times d}$, where $h$ is the filter window size, and
the input window is denoted as $x_{i:i+h-1}$. Therefore, the convolutional filter applied to a window of words
can be formalized as [20]

$$c_i = f(w \cdot x_{i:i+h-1}) \qquad (1)$$

Applying (1) to every window of a sentence of $n$ words produces a feature map $c \in \mathbb{R}^{n-h+1}$.
Moreover, with the activation function $f$ made explicit, this can be expressed as:

$$c_i = \max(0,\; w \cdot x_{i:i+h-1}) \qquad (2)$$

Therefore, the feature map of the convolution operation can be expressed as (3).

$$c = [c_1, c_2, \ldots, c_{n-h+1}] \qquad (3)$$
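
The convolution and feature map in (1) to (3) can be illustrated with a small pure-Python sketch. It assumes a single filter with no bias term and a ReLU activation. The function names `feature_map` and `global_max_pool` are hypothetical and chosen only for illustration.

```python
def feature_map(x, w):
    """Slide one filter over a sentence.

    x: list of n word vectors, each of length d
    w: filter given as h rows of length d (window size h)
    Returns the feature map [c_1, ..., c_{n-h+1}].
    """
    n, h = len(x), len(w)
    c = []
    for i in range(n - h + 1):
        # Dot product between the filter and the window x[i : i + h]
        s = sum(w[j][k] * x[i + j][k]
                for j in range(h)
                for k in range(len(w[j])))
        c.append(max(0.0, s))  # ReLU activation
    return c


def global_max_pool(c):
    """Keep only the strongest response of the filter over the sentence."""
    return max(c)
```

For a sentence of n = 4 words and a window of h = 2, the feature map has n - h + 1 = 3 entries, and global max pooling reduces it to a single value regardless of the sentence length.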

Table 2 shows the hyperparameters used in the performance analysis of our proposed models. The
CNN has assorted hyperparameters, such as the word embedding dimension, the number of filters, the batch
size, the number of epochs, and the number of convolutional layers.


Table 2. CNN hyperparameters 


Parameters Values 
Word Embedding Dimension 50 
Number of Convolution Layers 2,3,5,7,9 
Pooling Layers GlobalMaxPooling, MaxPooling 
Number of Filters 10, 16, 32, 64, 128, 256 
Filter Size D3 ST 
Number of Fully-Connected Layer [21] 1 
Activation Function ReLU, Softmax
Optimizer Adam 
Dropout [10] 0.6 
Regularizer [21] L2 
Batch Size 16, 32, 64, 128, 256 
Number of Epoch 50 


Performance metrics are used to show the capability of the classifier models, such as the accuracy
on the training, validation, and testing partitions of the dataset. Accuracy is the proportion of correct
predictions among all predictions of the proposed model and is derived from the confusion matrix, which
counts true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) [22]. The
area under the curve (AUC) of the receiver operating characteristic (ROC) or precision-recall (PR) curve is
a quality measure for binary classifiers. Unlike accuracy, AUC evaluates all the operating points of a
model, which trade off measures such as sensitivity, specificity, positive predictive value (PPV), and
negative predictive value (NPV) [23].
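
As a brief illustration of how these measures follow from the confusion matrix counts, the sketch below computes them for a single binary (one-vs-rest) label. The function name `binary_metrics` is hypothetical, and the sketch assumes every denominator is non-zero.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard measures derived from the counts of one binary confusion matrix."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # share of all predictions that are correct
        "sensitivity": tp / (tp + fn),   # true-positive rate (recall)
        "specificity": tn / (tn + fp),   # true-negative rate
        "ppv": tp / (tp + fp),           # positive predictive value (precision)
        "npv": tn / (tn + fn),           # negative predictive value
    }
```

Each of these values describes one operating point of the classifier, whereas AUC summarizes the curve traced as the decision threshold varies.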


2.4. Delta robot 

The delta robot structure commonly consists of a fixed frame and a driven frame. $R(O - xyz)$ denotes
the fixed frame located in the middle of the upper frame $ABC$. $R'(O - xyz)'$ is a reference frame located
in the end-effector frame $P_1P_2P_3$. The Z-axis is perpendicular to the end-effector, and the Y-axis is
parallel to $P_3O$ [15]. Figure 4(a) shows the delta robot frame and Figure 4(b) shows the delta robot
mechanical design [15], [24].

Inverse kinematics (IK) is used in robot control systems to transform end-effector coordinates in
Cartesian space into joint space [24]. In the delta robot control system, the IK inputs are the end-effector
coordinates on the base frame $R'(O - xyz)'$, from which the direction of the sliders on the Z-axis,
$Z_1$, $Z_2$, $Z_3$ [24], can be formalized as [15],


$$Z_i = z \pm \sqrt{L_i^2 - (x - x_i)^2 - (y - y_i)^2} \qquad (4)$$


where $Z_i$ denotes the slider direction on the Z-axis and $L_i$ denotes the link length between $B_iP_i$.
A self-learning delta robot successfully solved the inverse kinematics by using a deep learning approach
such as an ANN. The ANN is configured to predict the joint values of the end-effector from the given
coordinates [15].
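
The closed-form inverse kinematics can be sketched numerically under stated assumptions. The sketch assumes a linear delta robot in which slider i moves vertically at a fixed horizontal position (x_i, y_i), sits above the end-effector, and is joined to it by a rigid link of length L, so that (x - x_i)^2 + (y - y_i)^2 + (Z_i - z)^2 = L^2. The function name `slider_heights` and the `anchors` parameter are hypothetical and not taken from [15].

```python
import math


def slider_heights(x, y, z, anchors, link_length):
    """Slider position on the Z-axis for each anchor (x_i, y_i), upper root only."""
    heights = []
    for xi, yi in anchors:
        horizontal_sq = (x - xi) ** 2 + (y - yi) ** 2
        under_root = link_length ** 2 - horizontal_sq
        if under_root < 0:
            # The link is too short to reach this horizontal offset.
            raise ValueError("target is outside the reachable workspace")
        heights.append(z + math.sqrt(under_root))
    return heights
```

An ANN trained on pairs generated this way approximates the same mapping from end-effector coordinates to slider positions.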
Figure 4. Delta robot (a) frame and (b) mechanical design [15], [25]


3. RESULTS AND DISCUSSION 

This section presents the text preprocessing results, the text classification results for particular
CNN models, and the delta robot navigation based on the classification results. The experiment generated
50-dimensional GloVe and Word2Vec embeddings on which the proposed models were trained. The CNN models in
Table 3 were trained with their selected numbers of convolutional layers, filters, and kernel sizes. The
selected activation function is ReLU, the selected regularization is dropout with a rate of 0.6 [10], and the
selected batch size is 64 [26].


Table 3. Proposed CNN models 


Models          Conv Layers   Filters   Kernel Sizes
Model 1 [7]     5             64        3
Model 2 [10]    3             512       6,7,8
Model 3         3             128       5
Model 4 [21]    3             128       7,4,3
Model 5 [27]    4             90        2,3,4,5
Model 6 [25]    3             150       3,5,7
Model 7         3             64        3
Model 8         5             128       3
Model 9         7             128       2
Model 10        5             256       2,3,4,5,6


3.1. Text preprocessing results 

Figure 5 presents the text processing results: the raw text, the preprocessing result, and the result
after multiplier extraction. Figure 5(a) shows the raw text containing 516 rows of left 'kiri' commands, 496
rows of right 'kanan' commands, 598 rows of up 'atas' commands, and 414 rows of down 'bawah' commands.
Figure 5(b) presents the text preprocessing results, which follow from these processes: case folding,
punctuation removal, slang word replacement, tokenization, stop-word removal, stemming, and vectorization
into word embedding matrices. Figure 5(c) presents the word cloud after multiplier extraction, which contains
multiplier values such as 1 ('one' or 'satu'), 2 ('two' or 'dua'), ..., 9 ('nine' or 'sembilan'). The
multiplier value is used to multiply the number of steps from a selected set point in the x and y coordinates.


3.2. Classifier performance comparison 

The CNN model assessment contains the training and validation processes. Therefore, the dataset
must be partitioned into a training dataset and a validation dataset (20% of the training dataset). This
experiment uses 10 different CNN models. To compare their performance, metrics such as training accuracy,
validation accuracy, AUC, and execution time are used. Figure 6 shows accuracy graphs of several CNN
observations over 50 epochs. Tables 4, 5, and 6 were recorded after training and evaluating each proposed
CNN model ten times.
Figure 6. CNN observation graph of accuracy


Table 4. CNN model performance 


                Accuracy
Models          Tr.      Val.     Ts.      AUC      Exec. Time (s)
Model 1 [7]     0.9458   0.9532   0.9938   0.9228   17.6681
Model 2 [10]    0.9678   0.9726   0.9938   0.9517   39.7075
Model 3         0.9598   0.9618   0.9938   0.9433   18.4506
Model 4 [21]    0.9732   0.9772   0.9938   0.9621   18.3931
Model 5 [27]    0.9671   0.9698   0.9938   0.9495   18.2123
Model 6 [25]    0.9705   0.9701   0.9938   0.9569   19.5357
Model 7         0.9361   0.9561   0.9938   0.9308   16.1871
Model 8         0.9630   0.9656   0.9938   0.9393   21.7301
Model 9         0.9676   0.9717   0.9938   0.9456   23.1782
Model 10        0.9561   0.9602   0.9938   0.9292   48.0551

Table 5. CNN-GloVe model performance


Models          Tr.      Val.     Ts.      AUC      Exec. Time (s)
Model 1 [7]     0.9817   0.9868   0.9876   0.9875   18.8061
Model 2 [10]    0.9897   0.9952   0.9938   0.9952   57.1364
Model 3         0.9851   0.9937   0.9877   0.9989   18.7944
Model 4 [21]    0.9877   0.9925   0.9876   0.9921   19.3583
Model 5 [27]    0.9817   0.9908   0.9938   0.9879   19.2866
Model 6 [25]    0.9872   0.9891   0.9938   0.9929   19.8789
Model 7         0.9667   0.9817   0.9938   0.9802   17.1706
Model 8         0.9874   0.9917   0.9938   0.9932   22.3027
Model 9         0.9825   0.9779   0.9938   0.9879   23.1683
Model 10        0.9895   0.9925   0.9938   0.9941   68.7105
Table 6. CNN-Word2Vec model performance


Models          Tr.      Val.     Ts.      AUC      Exec. Time (s)
Model 1 [7]     0.9606   0.9718   0.9876   0.9595   18.7656
Model 2 [10]    0.9352   0.9431   0.9938   0.9128   55.7453
Model 3         0.9395   0.9503   0.9876   0.9128   18.4844
Model 4 [21]    0.9674   0.9739   0.9938   0.9673   18.8307
Model 5 [27]    0.9563   0.9673   0.9938   0.9459   18.9545
Model 6 [25]    0.9645   0.9701   0.9938   0.9546   19.1649
Model 7         0.9368   0.9579   0.9937   0.9395   16.9230
Model 8         0.9654   0.9703   0.9937   0.9515   21.3534
Model 9         0.9697   0.9654   0.9939   0.9535   21.7052
Model 10        0.9497   0.9545   0.9936   0.9275   68.9570


Based on Table 4, all models were evaluated on batches of the dataset, which contains the training
dataset (Tr.), the validation dataset (Val.), and the testing dataset (Ts.). The proposed models achieve
training accuracies ranging from 93.61% to 97.32%. AUC is calculated to measure the tradeoff between the
true-positive and false-positive rates, which reflects the balance between positive and negative labels.
Model 2 and Model 10 took longer to train than the other models because they have more filters and larger
kernel sizes.

Table 4 compares the performance of particular CNN models without word embedding matrices.
Meanwhile, Tables 5 and 6 show the CNN models embedded with word representation matrices, namely GloVe
and Word2Vec. By representing the preprocessed words as 50-dimensional word embedding matrices, the
weights of the standard CNN can be modified using these matrices. The GloVe-CNN and Word2Vec-CNN models
improve on the performance of the standard CNN model, most clearly for GloVe-CNN, whose training accuracy
reaches 98.97%.


3.4. System testing 

This section explains the end-to-end testing of the CNN classifier and the DNN of the delta robot.
Three navigation command sequences, selected based on Figure 1, pass through the text preprocessing step
and the classifier. Table 7 shows the extracted sentiment classification results and their multipliers.


Table 7. Navigation commands testing 


No  Seq. Nav. Commands                                       Clean Txt.     Extract. Nav.           Multiplier
1   move one step to the right ('pndah stu ktak kknan')      [illegible]    'kanan' [1, 0, 0, 0]    1
    move two steps to the top ('dua ktak [illegible]')       'atas dua'     'atas' [0, 0, 1, 0]     2
    to the right ('kknan')                                   [illegible]    'kanan' [1, 0, 0, 0]    1
    to the top ('trs keatas')                                [illegible]    'atas' [0, 0, 1, 0]     1
2   move two boxes to the bottom ('gser 2 ktak kbwah')       'bawah dua'    'bawah' [0, 0, 0, 1]    2
    two boxes to the right ('2 ktak knan')                   'kanan dua'    'kanan' [1, 0, 0, 0]    2
    move two boxes to the bottom ('gser 2 ktak kbwah lg')    'bawah dua'    'bawah' [0, 0, 0, 1]    2
    two boxes to the right ('2 ktak kknan')                  'kanan dua'    'kanan' [1, 0, 0, 0]    2
3   four boxes to the left ('pndah empat step kkri')         'kiri empat'   'kiri' [0, 0, 0, 1]     4
    move three boxes to the top ('pndah 3 step keats')       'atas tiga'    'atas' [0, 0, 1, 0]     3
    two boxes to the left ('duaa step kkri')                 'kiri dua'     'kiri' [0, 0, 0, 1]     2
    two boxes to the top ('ats 2 step')                      'atas dua'     'atas' [0, 0, 1, 0]     2

By sequencing sets of navigation commands that move the yellow box coordinate to the blue box
coordinate (shown in Figure 1), the classifier produces a binary pattern that indicates the navigation label.
The direction along the positive x-axis is denoted as the right ('kanan') direction, written in binary
representation as 1000. The direction along the negative x-axis is denoted as the left ('kiri') direction,
written as 0100. The direction along the positive y-axis is denoted as the top ('atas') direction, written as
0010. The direction along the negative y-axis is denoted as the bottom ('bawah') direction, written as 0001.
The multiplier factor indicates the number of steps by which the delta robot moves from the set-point
coordinate to the end-effector coordinate.
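
The mapping from binary patterns and multipliers to end-effector coordinates can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: the function name `follow`, the `STEPS` table, and the `step_size` parameter (the distance of one step, in arbitrary units) are assumptions introduced here.

```python
# Unit steps for the binary patterns described in the text.
STEPS = {
    (1, 0, 0, 0): (1, 0),    # 'kanan': positive x-axis
    (0, 1, 0, 0): (-1, 0),   # 'kiri': negative x-axis
    (0, 0, 1, 0): (0, 1),    # 'atas': positive y-axis
    (0, 0, 0, 1): (0, -1),   # 'bawah': negative y-axis
}


def follow(set_point, commands, step_size=1.0):
    """Apply (label, multiplier) pairs in order, starting from set_point.

    Returns every visited (x, y) coordinate, including the starting point.
    """
    x, y = set_point
    path = [(x, y)]
    for label, multiplier in commands:
        dx, dy = STEPS[tuple(label)]
        x += dx * multiplier * step_size
        y += dy * multiplier * step_size
        path.append((x, y))
    return path
```

For example, starting from (0, 0), the sequence right 1, up 2, right 1, up 1 ends at (2, 3). Each intermediate coordinate can then be passed to the inverse kinematics stage as the next set point.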


4. CONCLUSION 

Human-robot interaction (HRI) was successfully integrated with natural language understanding (NLU).
A sentiment classification was built to classify navigation commands for the delta robot. The word embedding
methods (GloVe and Word2Vec) vectorized words into 50-dimensional vectors, which were used to train
particular CNN models. The CNN models were evaluated with several numbers of layers, filters, and kernel
sizes. The CNN output contains four labels: navigation to the right ('kanan'), to the left ('kiri'), to the
top ('atas'), and to the bottom ('bawah'). Subsequently, the multiplier is extracted from each navigation
command input. The multiplier scales the step from a given set point along the x-axis and y-axis. The CNN
models perform well, with an accuracy of up to 98.97% and an execution time of less than a minute. The
transformation from the CNN classifier output to the delta robot z-sliders is handled by a DNN, whose inputs
are the x and y set point, the multiplier, and the CNN labels. Our DNN models were also trained to a high
accuracy of 96.43%.